15 Abstract:

16 Understanding the molecular basis of species formation is an important goal in 

17 evolutionary genetics, and Dobzhansky-Muller incompatibilities are thought to be a 

18 common source of postzygotic reproductive isolation between closely related lineages. 

19 However, the evolutionary forces that lead to the accumulation of such incompatibilities 

20 between diverging taxa are poorly understood. Segregation distorters are an important 

21 source of Dobzhansky-Muller incompatibilities between Drosophila species and crop plants, 

22 but it remains unclear if the contribution of these selfish genetic elements to reproductive 

23 isolation is prevalent in other species. Here, we genotype millions of single nucleotide 

24 polymorphisms across the genome from viable sperm of first-generation hybrid male 

25 progeny in a cross between Mus musculus castaneus and M. m. domesticus, two subspecies 

26 of rodent in the earliest stages of speciation. We then search for a skew in the allele 

27 frequencies of the gametes and show that segregation distorters are not measurable 

28 contributors to observed infertility in these hybrid males, despite sufficient statistical 

29 power to detect even weak segregation distortion with our novel method. Thus, reduced 

30 hybrid male fertility in crosses between these nascent species is attributable to other 

31 evolutionary forces. 
32 

33 Introduction: 

34 The Dobzhansky-Muller model (Dobzhansky 1937, Muller 1942) is widely accepted
35 among evolutionary biologists as the primary explanation of the accumulation of intrinsic

36 reproductive incompatibilities between diverging lineages (Coyne and Orr 2004, 

37 Presgraves 2010). Briefly, this model posits that genes operating normally in their native 
38 genetic background can be dysfunctional in a hybrid background due to epistatic

39 interactions with alleles from a divergent lineage. Although elucidating the molecular basis 

40 of speciation has been a central focus for decades, Dobzhansky-Muller incompatibilities 

41 (DMIs) have proved challenging to study because of their powerful effects on hybrid fitness 

42 (reviewed by Coyne and Orr 2004, Noor and Feder 2006, Presgraves 2010, Wu and Ting

43 2004). As a result, the specific genetic changes responsible for the onset of reproductive 

44 isolation between lineages remain largely obscure. 

45 The rapid evolution of selfish genetic elements is thought to be a potent source of 

46 DMIs between diverging lineages. In particular, segregation distorters are selfish elements 

47 that increase their transmission through heterozygous males by either disabling or 

48 destroying sperm that did not inherit the distorting allele (Lyttle 1991, Taylor and 

49 Ingvarsson 2003). Because males heterozygous for a distorter produce fewer viable sperm, 

50 segregation distorters can decrease the fitness of carriers. In this case, other loci in the 

51 genome are expected to evolve to suppress distortion (Hartl 1975). This coevolution of 

52 drivers and suppressors has been suggested to be a widespread source of DMIs between 

53 diverging lineages, and thus likely a contributor to reproductive isolation (Hurst and 

54 Pomiankowski 1991, Frank 1991, McDermott and Noor 2010). Indeed, there is strong 

55 evidence that segregation distorters are a primary cause of hybrid male sterility in several 

56 Drosophila species pairs (e.g. Tao et al. 2007ab, Phadnis et al. 2010, reviewed by 

57 McDermott and Noor 2010, Presgraves 2010) as well as in many crop species (e.g. Bohn 

58 and Tucker 1940, Cameron and Moav 1957, Sano et al. 1979, Loegering and Sears 1963). 

59 However, comparatively little is known about the genetics of speciation in natural populations
60 aside from these taxa, and it remains unclear if distorters contribute to hybrid sterility in

61 other taxa more generally. 

62 Comparative analyses aimed at identifying the genetic targets of positive selection 

63 suggest that segregation distorters may be an important source of DMIs in mammalian 

64 lineages. One particularly intriguing finding shows a substantial overrepresentation of loci 

65 associated with spermatogenesis and apoptosis within the set of genes with the strongest 

66 evidence for recurrent positive selection (e.g. Nielsen et al. 2005, Kosiol et al. 2008). These 

67 functions in turn are potentially driven at least in part by segregation distorters, which are 

68 expected to leave just such a mark of selection as they sweep through a population. 

69 Therefore, mammals are an appealing group in which to test for segregation distortion and 

70 its role in speciation. 

71 In particular, Mus musculus domesticus and M. m. castaneus are two subspecies of 

72 house mice in the earliest stages of the evolution of reproductive isolation (Boursot et al. 

73 1996, Geraldes et al. 2008). Hybrid males suffer from many reproductive deficiencies 

74 (Davis et al. 2007); specifically, they are known to have decreased testis size and to 

75 produce fewer sperm than either parental subspecies (White et al. 2012). Moreover, it has 

76 been reported that there are numerous loci that affect fertility in hybrid males and also that 

77 the vas deferens of first-generation hybrid (F1) males contains more apoptotic sperm cells

78 than either pure strain (White et al. 2012). In combination with comparative genomic 

79 evidence, these phenotypic observations suggest that coevolution of segregation distorters 

80 and their suppressors may contribute to DMIs in M. musculus. 

81 The conventional method of identifying segregation distortion relies on detecting a 

82 skew in the allele frequencies of the offspring of a heterozygous individual. However, 
83 methods that rely on genotyping progeny unavoidably conflate segregation distortion,

84 female effects on sperm function, and differential viability. Moreover, practical issues limit 

85 the power of these experiments — specifically, the ability to produce and genotype 

86 hundreds to thousands of offspring to detect distorters of small effect, particularly in 

87 vertebrates. As a result of modest sample sizes, many experiments designed to detect 

88 distortion based on segregation in genetic crosses are underpowered and unable to detect 

89 even moderate distortion. Hence, it is challenging to study segregation distorters through a 

90 conventional crossing scheme. 

91 Here, we explore a novel approach to surveying the genome for segregation 

92 distortion by directly sequencing viable gametes from F1 hybrid M. m. domesticus/M. m.

93 castaneus males. Briefly, we enriched for viable sperm in hybrids and then sequenced these 

94 sperm in bulk, along with control tissues, to identify any skew in the representation of 

95 either parental chromosome in the viable sperm relative to the control. While we 

96 demonstrated via simulation that our experimental design has excellent power to detect 

97 segregation distorters, we found no evidence of segregation distortion in this cross, 

98 suggesting that segregation distorters are not a primary contributor to male infertility in M. 

99 m. castaneus and M. m. domesticus hybrids. Nonetheless, this approach can be applied to a 

100 wide range of species, and we therefore expect that it will be a useful means to study the 

101 frequency and impact of segregation distortion more generally. 
102 

103 Methods: 

104 Reference Genome Assembly: To generate robust genome assemblies for each of the two 

105 strains of interest, we aligned all short read data for M. m. castaneus strain (CAST/EiJ) and 
106 M. m. domesticus strain (WSB/EiJ) from a recent large-scale resequencing project (Keane et

107 al. 2011) to the MM9 genome assembly using BWA mem v0.7.1 (Li and Durbin 2009) for the

108 initial mapping. For reads that failed to map with high confidence, we remapped using 

109 stampy v1.0.17 (Lunter and Goodson 2011). We realigned reads that overlap indels, and

110 called SNPs and indels for each strain using the Genome Analysis Tool Kit (GATK, DePristo 

111 et al. 2011). For each program, we used default parameters, except that during variant

112 calling we used the option '--sample_ploidy 1' because the strains are extremely inbred.

113 We called the consensus sequence for each strain at sites where both assemblies 

114 have high quality data. That is, if both CAST and WSB assemblies had a q30 minimum 

115 quality genotype (either indels or SNPs) that site was added to both consensus sequences. 

116 Otherwise, if either or both assemblies were below this quality threshold at a given site, we 

117 recorded for each consensus the MM9 reference allele. 
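
The consensus rule above can be sketched in a few lines. This is only an illustration of the stated logic, not the pipeline that was actually run: the function name, the (allele, quality) tuple representation of a call, and the treatment of a missing call as failing the threshold are all assumptions made for the example.

```python
MIN_QUAL = 30  # the q30 threshold described in the text


def consensus_alleles(ref_allele, cast_call, wsb_call, min_qual=MIN_QUAL):
    """Return (CAST consensus allele, WSB consensus allele) for one site.

    Each call is a hypothetical (allele, quality) tuple, or None when the
    strain has no genotype at the site (assumed here to fail the filter).
    """
    def passes(call):
        return call is not None and call[1] >= min_qual

    # Strain-specific alleles are used only when BOTH strains pass the filter.
    if passes(cast_call) and passes(wsb_call):
        return cast_call[0], wsb_call[0]
    # Otherwise both consensus sequences fall back to the MM9 reference allele.
    return ref_allele, ref_allele
```

For instance, `consensus_alleles("A", ("G", 45), ("A", 12))` falls back to the reference allele in both consensus sequences because the WSB call is below the threshold.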
118 

119 Alignment Simulation : Our goal was to align short read data to a single diploid reference 

120 genome, comprised of assemblies from the two parental strains. The mapping quality, 

121 which indicates the probability that a read is incorrectly mapped in the position indicated 

122 by the aligner, should then provide a reliable means of distinguishing whether a read can 

123 be confidently assigned to one of the parental genomes. To confirm the accuracy of this 

124 approach and to identify suitable quality thresholds, we performed simulations using 

125 SimSeq (https://github.com/jstjohn/SimSeq). We used the sequencing error profiles 

126 derived from our mapped data (below) and found qualitatively similar error rates using the 

127 default error profile included with the SimSeq software package (data not shown). For both 

128 the CAST and WSB genomes, we simulated 10,000,000 pairs of 94-bp paired-end reads, 
129 whose size distribution was set to match that of our libraries (below). We then mapped

130 these reads back to the single reference genome containing both CAST and WSB consensus 

131 sequences. We scored reads as 'mapping correctly' if they mapped to within 10 bp of their 

132 expected location measured by their left-most coordinate and on the correct subspecies' 

133 chromosome. If the pair mapped, we required that the insert length be less than 500 bp, 

134 which is well within three standard deviations of the mean insert size of our data and 

135 should therefore encompass the vast majority of read pairs. If both reads in a pair mapped 

136 and met our criteria above, we used the higher mapping quality of the two, and discarded 

137 the other read. This filter is important, here and below, as it avoids counting pairs as 

138 though their provenance is independent of their pair. 
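
The scoring and pair-handling criteria described above could be expressed roughly as follows. This is a minimal sketch under stated assumptions: the `Alignment` record and both function names are hypothetical, and the insert length is assumed to be supplied by the aligner rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    genome: str  # "CAST" or "WSB": which subspecies' chromosome the read hit
    chrom: str
    pos: int     # left-most mapped coordinate
    mapq: int    # mapping quality reported by the aligner


def maps_correctly(aln, true_genome, true_chrom, true_pos, tolerance=10):
    """True if the read lies within `tolerance` bp of its simulated origin
    and on the correct subspecies' chromosome."""
    return (aln.genome == true_genome
            and aln.chrom == true_chrom
            and abs(aln.pos - true_pos) <= tolerance)


def representative_read(first, second, insert_length, max_insert=500):
    """Keep one alignment per mapped pair (the higher mapping quality),
    or None if the insert length fails the < 500 bp requirement."""
    if insert_length >= max_insert:
        return None
    return first if first.mapq >= second.mapq else second
```

Keeping a single alignment per pair means each sequenced fragment contributes at most one observation, which is the independence concern raised in the text.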
139 

140 Experimental Crosses and Swim-Up Assay: To create first-generation (F1) hybrids of Mus

141 subspecies, we crossed 2 M. m. castaneus males to 3 M. m. domesticus females and 2 M. m. 

142 domesticus males to 5 M. m. castaneus females in a harem-mating scheme. In total, we 

143 produced 8 male F1s in each direction of the cross. F1 males whose sire was M. m. castaneus

144 (CAST genome) are referred to as CW, and those whose sire was M. m. domesticus (WSB 

145 genome) as WC. All males were housed individually for a minimum of two weeks prior to 

146 sacrifice between 90 and 120 days of age. 

147 To enrich for viable sperm from each F1 male, we performed a standard swim up

148 assay (Holt et al. 2010). First, immediately following sacrifice, we collected and flash-froze 

149 liver and tail control tissues (liver samples, N = 16; tail samples, N = 8). We then removed

150 and lacerated the epididymides of each male, placed this tissue in 1.5 ml of human tubal 

151 fluid (Embryomax® HTF, Millipore), and maintained the sample at a constant 37 °C for 10 
152 minutes. Next, we isolated the supernatant, containing sperm that swam out of the

153 epididymides, and spun this sample for 10 minutes at 250 g. We then discarded the 

154 supernatant, repeated the wash, and this time allowed sperm to swim up into the solution 

155 for an hour to select the most robust cells. Finally, we removed the solution, transferred 

156 it to a new vial, pelleted these sperm by centrifugation, and froze them at -80 °C.
157 

158 Library Preparation and Sequencing: For each F1 hybrid male, we first extracted DNA from

159 sperm, liver, and tail tissues identically using a protocol designed to overcome the difficulty 

160 of lysing the tightly packed DNA within sperm nuclei (Qiagen Purification of total DNA from 

161 animal sperm using the DNeasy Blood & Tissue Kit; protocol 2). We sheared this DNA by

162 sonication to a target insert size of 300 bp using a Covaris S220, then performed blunt-end 

163 repair, adenylation, and adapter ligation following the manufacturer protocol (New 

164 England BioLabs). Following ligation, libraries were pooled into two groups of 16 and one 

165 group of 8 based on the adapter barcodes. Prior to PCR, each pool was subject to automated 

166 size selection for 450-500 bp to account for the addition of 175 bp adapter sequences, 

167 using a Pippin Prep (Sage Science) on a 2.0% agarose gel cassette. PCR was performed

168 using six amplification cycles, and then we re-ran the size selection protocol to eliminate 

169 adapter dimer prior to sequencing. Finally, we pooled the three libraries and sequenced 

170 them on two lanes of a HiSeq 2500. Each sequencing run consisted of 100 bp paired-end 

171 reads, of which the first 6 bp are the adapter barcode sequence, and the remaining 94 bp 

172 are derived from randomly-sheared gDNA. 
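
As a small illustration of the read layout just described (6 bp of barcode followed by 94 bp of genomic sequence in each 100 bp read), a read could be split as below. The function name and the plain-string representation of sequence and quality are assumptions for the example; they do not describe the demultiplexing software actually used.

```python
BARCODE_LEN = 6  # the first 6 bp of each read are the adapter barcode


def split_read(sequence, qualities):
    """Split a read into (barcode, genomic sequence, genomic qualities)."""
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality strings differ in length")
    barcode = sequence[:BARCODE_LEN]
    # The remainder is derived from randomly sheared genomic DNA.
    return barcode, sequence[BARCODE_LEN:], qualities[BARCODE_LEN:]
```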

173 Alignment and Read Counting: We aligned read data to the combined reference genome 

174 using 'BWA mem' as described above in the alignment simulation. We removed potential 
175 PCR duplicates using Picard v1.73. We then filtered reads based on the alignment filtering

176 criteria described above for the simulated data. Because copy number variations may pose 

177 problems for our analysis, we attempted to identify and exclude these regions. Specifically, 

178 we broke the genome into non-overlapping 10 kb windows. Then, within each library, we 

179 searched for 10 kb regions that had a sequencing depth greater than two standard 

180 deviations above the mean for that library. All aberrantly high-depth windows identified 

181 were excluded in downstream analyses in all libraries. These regions, representing 

182 approximately 7% of the windows in the genome, are reported in Supplemental Table S1.
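
The high-depth window filter can be sketched as follows. This is an illustrative reading of the procedure, with assumptions flagged: per-window depths are taken as already computed for each library, the sample standard deviation is used (the text does not say which estimator was applied), and the function names are hypothetical.

```python
from statistics import mean, stdev


def high_depth_windows(depths, n_sd=2.0):
    """Indices of windows whose depth exceeds mean + n_sd * SD for one library.

    `depths` is a list of per-window depths (10 kb windows in the text).
    """
    cutoff = mean(depths) + n_sd * stdev(depths)  # sample SD: an assumption
    return {i for i, depth in enumerate(depths) if depth > cutoff}


def excluded_windows(depths_by_library):
    """Union of flagged windows across libraries: a window flagged in any
    single library is dropped from every library downstream."""
    excluded = set()
    for depths in depths_by_library.values():
        excluded |= high_depth_windows(depths)
    return excluded
```

Taking the union across libraries matches the statement that flagged windows were excluded from downstream analyses in all libraries.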

183 Next, to identify regions showing evidence of segregation distortion, we conducted 

184 windowed analyses with 1 Mb between the centers of adjacent windows. We counted reads 

185 in each window as a decreasing function of their distance from the center of the window, 

186 and included no reads at distances greater than 20 cM, thereby placing the most weight in a 

187 window on the center of the window. We then analyzed each window in two mixed-effects 

188 generalized linear models. Both models included random effects for the libraries and 

189 individuals. The first model includes no additional factors. The second had fixed effects for 

190 tissue, direction of cross, and an interaction term based on tissue by direction of cross 

191 effects, and thus has five fewer degrees of freedom than the first model. Hence, for each 

192 window, we assessed the fit of the second model relative to the first using a likelihood ratio 

193 test, wherein the log likelihood ratio should be chi-square distributed with 5 degrees of 

194 freedom. Afterwards, we applied a false-discovery rate multiple testing correction to the 

195 data (Benjamini and Hochberg 1995). We performed all statistical analyses in R (R 

196 Development Core Team 2011). 
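
The per-window test and multiple-testing correction can be illustrated with a short sketch. The analyses above were run in R; the Python below is only a stand-in for the final two steps, assuming the log-likelihoods of the two fitted models are already available. It uses the standard convention that the test statistic is twice the difference in log-likelihoods, a closed-form upper tail for a chi-square distribution with 5 degrees of freedom, and a textbook Benjamini-Hochberg adjustment. All function names are hypothetical.

```python
import math


def chi2_sf_5df(x):
    """Upper-tail probability of a chi-square variable with 5 df."""
    if x <= 0:
        return 1.0
    # Closed form for odd degrees of freedom (here k = 5).
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3))


def lrt_pvalue(loglik_null, loglik_full):
    """Likelihood ratio test of the fixed-effects model against the null."""
    statistic = 2 * (loglik_full - loglik_null)
    return chi2_sf_5df(statistic)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A window would then be called significant when its adjusted p-value falls below the chosen false-discovery rate.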
197 
198 Power Simulations: To estimate the power of our method, we simulated distortion data. We

199 began by selecting sites randomly distributed across the genome, and for each site drew a 

200 distortion coefficient from a uniform distribution between -0.05 and 0.05. Each read on the 

201 parental genome that was susceptible to distortion was counted on the distorting genome 

202 with probability equal to the distortion coefficient multiplied by the probability that no 

203 recombination events occurred between the distorted locus and the read. We also did the 

204 alternative (i.e. switching reads from the distorted-against genome to the distorting

205 genome) by multiplying by the probability that a recombination event was expected to occur.

206 We determined recombination probabilities using the genetic map reported in Cox et al. 

207 (2009). We performed the simulation for both parental genomes, and then again for each 

208 parental genome but with the distortion limited to one direction of the cross (e.g. only 

209 sperm from CW males experienced distortion). A direction-specific effect could occur if, for 

210 example, suppressing alleles are present on the Y chromosome of one subspecies and 

211 therefore are only present in CW or WC males. 
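
The read-switching step of the power simulation might look like the sketch below. Several parts are assumptions rather than details from the text: recombination probabilities are derived here from genetic distance with Haldane's map function instead of being read from the Cox et al. (2009) map, the distortion coefficient is passed as a magnitude, and the function names are hypothetical.

```python
import math
import random


def prob_no_recombination(distance_cm):
    """P(no net recombination) between a read and the distorter locus,
    using Haldane's map function (an assumption for this sketch)."""
    morgans = abs(distance_cm) / 100
    recombination_fraction = 0.5 * (1 - math.exp(-2 * morgans))
    return 1 - recombination_fraction


def simulate_switched_reads(read_distances_cm, coefficient, rng):
    """Count reads from the susceptible genome that are re-assigned to the
    distorting genome, each with probability coefficient * P(no recomb.)."""
    if not 0 <= coefficient <= 1:
        raise ValueError("coefficient must be a magnitude between 0 and 1")
    switched = 0
    for distance in read_distances_cm:
        if rng.random() < coefficient * prob_no_recombination(distance):
            switched += 1
    return switched
```

In a full simulation the coefficient would be drawn as described above, for example `abs(rng.uniform(-0.05, 0.05))`, with the sign deciding which parental genome acts as the distorter.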
212 

213 Results 

214 After addressing the possibility of contamination, labeling, and quality issues (see 

215 Supplemental Text S1, Supplemental Table S2), we ran our analysis of the data across all

216 autosomes, excluding regions with evidence for copy-number variations (described in 

217 Methods). With the exception of windows on chromosome 16 (see below), we found no 

218 windows with a statistically significant signature of segregation distortion. The lowest 

219 uncorrected p-value for any window (aside from those on chromosome 16) was 0.0224 

220 (Figure 1), which is not significant when we corrected for multiple tests. 
221 By contrast, on chromosome 16, we identified 15 contiguous windows with

222 significantly skewed allele frequencies following correction for multiple comparisons 

223 (minimum p = 5.026E-4; Figure 2). However, upon closer examination, it appears that this 

224 signal is driven almost entirely by a single liver sample, that of individual CW10. If this 

225 sample is removed from the dataset, this chromosome no longer shows significant 

226 deviation from expectations. When comparing the relative read depths across 

227 chromosomes 16 and 1, CW10's liver sample also appears to have disproportionately lower

228 depth on this chromosome relative to CW10's sperm sample (p = 3.02E-5; χ² test). These

229 results suggest that this pattern is likely driven by a somatic aneuploidy event in CW10's

230 liver that occurred relatively early in liver development and is not the result of distortion

231 in the sperm library. 
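
A comparison of relative read depths of this kind can be framed as a test on a 2 x 2 table of read counts (chromosome 16 versus chromosome 1, liver versus sperm). The sketch below is a generic Pearson chi-square test without continuity correction; whether the reported test used exactly this variant is not stated in the text, and the function name and any counts passed to it are hypothetical.

```python
import math


def chi2_test_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    For example: rows = liver and sperm, columns = reads on chromosome 16
    and reads on chromosome 1. Returns (statistic, p-value) with 1 df.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one read")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```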

232 One concern for the interpretation of our results is whether we have sufficient 

233 statistical power, given our experimental design, to detect segregation distortion if it is 

234 indeed occurring in hybrid males. We addressed this issue through simulation. First, for the 

235 purpose of assessing power, we selected an ad hoc significance level of α = 0.001. Given

236 that this cutoff is substantially lower than we observed in most genomic windows, it is 

237 likely a conservative measure for assessing power. Based on our simulations, we found that 

238 we have 50% power to detect segregation distortion of approximately 0.015 (this number

239 reflects the positive or negative deviation from the null expectation, 0.5) if distortion

240 affects CW and WC males equally. In other words, we have 50% power to detect distortion 

241 that is greater than 51.5% or less than 48.5%. If there is directionality to the distortion 

242 effect (i.e. only CW or only WC males experience SD), we have 50% power to detect 

243 distortion of 0.017 for CW males and 0.019 for WC males. This slight difference in power 
244 based on cross direction likely reflects differences in sequencing depth between WC and

245 CW sperm and liver samples. It is also important to note that because read mapping and 

246 sequencing, as well as divergence between the CAST and WSB strains and their divergence 

247 from the reference genome, are non-uniform across the genome, different regions of the 

248 genome will differ slightly in power to detect distortion. 
249 

250 Discussion 

251 Elucidating the genetic mechanisms underlying species formation is a central goal of 

252 evolutionary biology. Although there has been progress in identifying genes that contribute 

253 to reproductive isolation with a few elegant examples (e.g. Bradshaw and Schemske 2003, 

254 Lassance et al. 2010, Mihola et al. 2009), several from Drosophila species (e.g. Ting et al.

255 1998, Masly et al. 2006, Bayes and Malik 2009), it is unclear how generalizable these 

256 results are. For example, segregation distorters contribute to reproductive isolation in 

257 some young Drosophila species pairs (Phadnis and Orr 2010; Tao et al. 2007ab) but here, 

258 to our surprise, we find no evidence for segregation distortion between two nascent 

259 species M. m. castaneus/M. m. domesticus, despite strong experimental power. 

260 This conclusion however must be qualified to some degree. Segregation distorters 

261 are generally classified as either gamete disablers or gamete killers depending on their 

262 mode of action (reviewed in Lyttle 1991, Taylor and Ingvarsson 2003). We expect that 

263 gamete killers would be detected by our approach since their competitors may not be 

264 present in the epididymides. If present, these sperm would not be captured in our stringent 

265 swim up assay. Our ability to detect gamete disablers, however, depends on the specific

266 mechanism by which these genetic elements disable their competitors. If the motility or 






Downloaded from http://biorxiv.org/on September 18, 2014 

267 longevity of a sperm cell is sufficiently impaired, it is likely that this sperm would fail to 

268 swim into solution, but if the distortion effect has a very subtle effect on motility or impairs 

269 function later in the sperm life cycle (e.g. by causing a premature acrosome reaction), it is

270 unlikely that our method could detect these effects. Thus, although gamete killers are not 

271 prevalent sources of DMIs in these subspecies, we cannot completely exclude the 

272 possibility that gamete disablers are important in M. musculus species formation. However, 

273 it is worth noting that disablers cannot explain the reported observation of increased

274 apoptosis of sperm cells in hybrid males (White et al. 2012).

275 Conventional methods of detecting segregation distortion (i.e. genotyping progeny) 

276 are usually statistically underpowered and thus unable to detect even modest distortion 

277 effects. Moreover, requiring the presence of viable progeny unavoidably conflates viability, 

278 gamete competition, and segregation distortion effects. By contrast, by sequencing high

279 quality gametes from individual males and comparing allele ratios in these gametes to 

280 those of somatic tissues, we have excellent power to detect fairly modest segregation 

281 distorters. For example, we could detect an aneuploidy event that resulted in a 4% 

282 difference in the allele frequencies of a single individual relative to expectations. 

283 Nonetheless, we found little evidence that segregation distorters are active in F1 hybrid

284 males, which indicates that segregation distortion (i.e. gamete killing) is not a primary 

285 contributor to reduced F1 male fertility in these subspecies.

286 Because our method of determining the allele ratios in bulk preparations of viable 

287 gametes relative to somatic tissues is very general, we expect that it will be useful in a wide 

288 variety of systems for a diversity of questions. Provided one can accurately phase the 

289 diploid genome of an individual, by e.g. using complete parental genotype data when 
290 inbred strains are not available, it is straightforward to apply this method to assay

291 segregation distortion in a wide variety of taxa (including humans) and thus more easily 

292 survey the prevalence of segregation distortion as an isolating barrier both within and 

293 between species. This approach allows segregation distortion to be weighed against other 

294 possible sources of DMIs that may occur during spermatogenesis, oogenesis, fertilization, 

295 or embryogenesis, but that leave an identical signature to SD in conventional cross-based

296 experiments (e.g. White et al. 2012). Furthermore, extensions of our method may help to

297 increase the generality of this approach. For example, if suitable fluorescent probes specific 

298 to cell states of interest are available, it would be straightforward to divide these cell 

299 populations using fluorescence-assisted sorting techniques, and determine the differences 

300 in allele frequencies between states. Importantly, this need not be limited to gamete cells, 

301 thus our method may have applications to a variety of other fields (e.g. cancer biology). 

302 While segregation distorters appear to be an important mechanism of speciation in 

303 Drosophila and crop plants, efforts to detect SD in other diverging lineages — especially 

304 those with high statistical power — have been limited. We find that at least in M. m. 

305 castaneus/M. m. domesticus hybrids, segregation distorters are not measurable 

306 contributors to observed infertility in F1 hybrid males, despite strong statistical power to detect

307 them, suggesting that reduced hybrid male fertility in these nascent species is attributable 

308 to other underlying genetic causes. Further studies, using the novel approach developed 

309 here, will provide a powerful way to gain more comprehensive understanding of the role of 

310 SD in the evolution of reproductive isolation between diverging lineages. 
311 
324 Figure and Table Legends

325 Figure 1. Average proportion CAST reads in sperm libraries versus liver libraries. Using all 

326 males (A), using only CW males (B), and using only WC males (C). Lines indicate the 

327 approximate threshold at which we would have 50% power to detect distortion at the 

328 alpha = 0.001 level (see Methods for how this threshold was calculated).

329 

330 Figure 2. Proportion of informative reads that are derived from the CAST genome across 

331 chromosome 16. CW4's liver sample is shown in red, and CW4's sperm sample is shown in 

332 green. All other CW libraries are represented in black for liver and in blue for sperm. 

333 

334 Figure 3. Probability of detecting segregation distortion loci based on simulations wherein 

335 distortion has no polarity (A), is in CW males only (B), or is in WC males only (C). For 
336 visualization, all simulations are normalized to a 50:50 null expectation to account for

337 differences in the idiosyncratic mapping properties of regions of the genome that may not 

338 conform to 50:50 expectations (see Figure 1).
339 

340 Supplemental Figure S1. Cartoon of experimental cross scheme. Inbred parental strains are

341 crossed, and individual F1 males sacrificed at 4 months, when their sperm are subjected to

342 a swim up assay. Libraries were prepared from liver, tail and sperm samples, sequenced, 

343 and then aligned to a reference genome to determine subspecies of origin.
344 

345 Supplemental Text S1. Supplemental methods describing quality control steps to ensure

346 samples are not contaminated or mislabeled. 
347 

348 Supplemental Table S1. List of genomic windows excluded from all downstream analyses

349 due to detection of individual libraries with unusually high depth. 
350 

351 Supplemental Table S2. Quality control results for the quantity of reads in each library 

352 derived from the Y chromosome, X chromosome, and mtDNA. 
353 

354 Supplemental Table S3. Alignment simulation results showing the relationship between 

355 the reported mapping quality for a read and its probability of correct assignment to the 

356 genomic location from which it was derived. 
357 